Don't remove quotes if \ or " are present inside #2048
Merged
Byron merged 3 commits into gitpython-developers:main on Jun 9, 2025
This refactors ConfigParser double-quote parsing near the single line double-quoted value parsing code, so that:

- Code that parses the name is less intermixed with code that parses the value.
- Conditional logic is less duplicated.
- The `END` comment notation appears next to the code it describes.
- The final `else` can be turned into one or more `elif` followed by `else` to cover different cases of `"..."` differently. (But those are not added here. This commit is purely a refactoring.)

(The `pass` suite when `len(optval) < 2 or optval[0] != '"'` is awkward and not really justified right now, but it looks like it may be able to help with readability and help keep nesting down when new `elif` cases are added.)
These are cases where just removing the outer quotes without doing anything to the text inside does not give the correct result, and where keeping the quotes may be preferable, in that it was the long-standing behavior of `GitConfigParser`.

That this was the long-standing behavior may justify bringing it back when the `"`-`"`-enclosed text contains such characters, but it does not justify preserving it indefinitely: it will still be better to parse the escape sequences, at least in the typical case that all of them in a value's representation are well-formed.
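As a rough illustration of what parsing well-formed escape sequences could look like, here is a minimal sketch. It is not part of this PR or of GitPython: the function name `unescape_quoted` and the table `_ESCAPES` are hypothetical, and the escape set (`\\`, `\"`, `\n`, `\t`, `\b`) is the one documented for Git config values. The sketch conservatively returns `None` for anything it does not handle, including an unescaped `"` inside the text, so a caller could fall back to the long-standing behavior.

```python
from typing import Optional

# Escape sequences documented for Git config values (hypothetical table name).
_ESCAPES = {"\\": "\\", '"': '"', "n": "\n", "t": "\t", "b": "\b"}


def unescape_quoted(inner: str) -> Optional[str]:
    """Decode the text between the outer quotes, or return None if not handled.

    Hypothetical helper for illustration only; not GitPython's API.
    """
    out = []
    i = 0
    while i < len(inner):
        ch = inner[i]
        if ch == '"':
            # An unescaped quote inside the text: this sketch declines to handle it.
            return None
        if ch == "\\":
            # A backslash must be followed by a recognized escape character.
            if i + 1 >= len(inner) or inner[i + 1] not in _ESCAPES:
                return None
            out.append(_ESCAPES[inner[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Returning `None` rather than raising keeps the decision about what to do with unhandled text in the caller, which matches the idea of preserving existing results where escapes are not interpreted.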
This is for single line quoting in the ConfigParser.

This leaves the changes in gitpython-developers#2035 (as adjusted in gitpython-developers#2036) intact for the cases where it addressed gitpython-developers#1923: when the `...` in `"..."` (appearing in the value position on a single `{name} = {value}` line) has no occurrences of `\` or `"`, quote removal is enough.

But when `\` or `"` does appear, this suppresses quote removal. This is with the idea that, while it would be better to interpret such lines as Git does, we do not yet do that, so it is preferable to return the same results we have in the past (which some programs may already be handling themselves).

This should make the test introduced in the preceding commit pass. But it will be even better to support more syntax, at least well-formed escapes. As noted in the test, both the test and the code under test can be adjusted for that. (See comments in gitpython-developers#2035 for context.)
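The rule described above can be sketched as a small standalone function. This is only an illustration of the decision logic, not the code in the PR: the name `strip_simple_quotes` is hypothetical, and the sketch ignores multi-line values, simply leaving any text without a closing `"` at the end unchanged.

```python
def strip_simple_quotes(optval: str) -> str:
    """Remove outer double quotes only when the enclosed text is 'simple'.

    Hypothetical helper for illustration; not GitPython's actual code.
    """
    if len(optval) < 2 or optval[0] != '"':
        # Not a double-quoted value: nothing to do.
        return optval
    if optval[-1] != '"':
        # No closing quote at the end of the line: out of scope for this sketch.
        return optval
    inner = optval[1:-1]
    if "\\" in inner or '"' in inner:
        # Escapes are not interpreted yet, so keep the long-standing result.
        return optval
    # No backslash or inner quote: quote removal is enough.
    return inner
```

For example, `strip_simple_quotes('"hello world"')` yields `hello world`, while a value such as `"a\\b"` (with a backslash inside the quotes) is returned with its quotes intact.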
Byron (Member) approved these changes on Jun 9, 2025 and left a comment:
Thanks a lot! To me this looks like a clear improvement, and the tests lay the foundation for further improvements if there is community interest or need.
At this point, I don't think there is anyone truly knowledgeable with this codebase anymore, and all I can say here is that the only good way to do such a parser is to write it from scratch. After all, setting up something on top of an INI parser is already incorrect.
Let's merge and fix issues as they arise.
Background
#2035 fixed issue #1923, where the ConfigParser would not remove the quotes around single-line values. As discussed in comments there:
Let's take the best of both worlds (so far)
This PR keeps the changes from #2035 in the case that they work because the text contained strictly between the beginning and ending `"` characters contains neither any `\` nor any other `"`. This both:

(... `\` is meant to be preserved rather than treated as an escape character. This is presumably rare--if it ever happens--since that's not the syntax of double-quoted values in Git config files.)

Changes
But it can get better than this
This is not intended as a long-term alternative to parsing escape sequences. The idea in #2035 (comment) of handling them is good, and this is not meant to discourage or interfere with that. The new test fixture and test can be modified accordingly. See the docstring and comments in `test_config_with_quotes_containing_escapes`.

For review
It seems to me that the idea here is sound, since it restores the main branch to a state where no changes are expected to produce problems for programs and libraries that use GitPython, if a patch release were to be made.
But even if I am right to think that, there are a few reasons it may be useful to have a review here before merging:
(This follows #2046 and #2047, which followed #2035 and #2036.)